30 research outputs found

    Lipschitz Optimisation for Lipschitz Interpolation

    Techniques known as Nonlinear Set Membership prediction, Kinky Inference or Lipschitz Interpolation are fast and numerically robust approaches to nonparametric machine learning that have been proposed for use in system identification and learning-based control. They utilise presupposed Lipschitz properties to compute inferences over unobserved function values. Unfortunately, most of these approaches rely on exact knowledge of the input space metric as well as of the Lipschitz constant. Furthermore, existing techniques to estimate Lipschitz constants from data are not robust to noise or seem ad hoc, and typically are decoupled from the ultimate learning and prediction task. To overcome these limitations, we propose an approach for optimising parameters of the presupposed metrics by minimising validation set prediction errors. To avoid poor performance due to local minima, we propose to utilise Lipschitz properties of the optimisation objective to ensure global optimisation success. The resulting approach is a new flexible method for nonparametric black-box learning. We provide experimental evidence of the competitiveness of our approach on artificial as well as on real data.
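To make the prediction rule behind Lipschitz interpolation concrete, the following is a minimal sketch and not the authors' implementation. It assumes noise-free samples, a known Lipschitz constant and the Euclidean metric; the function name `lipschitz_interpolate` and its parameters are hypothetical. Each sample bounds the unknown function from above and below, and the prediction is the midpoint of the tightest bounds.

```python
import numpy as np

def lipschitz_interpolate(x_query, x_train, y_train, lipschitz_const):
    """Predict at x_query (n_query, d) from samples x_train (n_train, d), y_train (n_train,).

    Assumes noise-free data, a known Lipschitz constant and the Euclidean metric.
    Returns (prediction, lower_bound, upper_bound), each of shape (n_query,).
    """
    x_query = np.asarray(x_query, dtype=float)
    x_train = np.asarray(x_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    # Pairwise Euclidean distances, shape (n_query, n_train).
    dists = np.linalg.norm(x_query[:, None, :] - x_train[None, :, :], axis=2)
    # Tightest ceiling and floor implied by the Lipschitz property.
    upper = np.min(y_train[None, :] + lipschitz_const * dists, axis=1)
    lower = np.max(y_train[None, :] - lipschitz_const * dists, axis=1)
    return 0.5 * (upper + lower), lower, upper

# Usage: two samples of a function with Lipschitz constant at most 2.
pred, lo, hi = lipschitz_interpolate([[0.5], [0.0]], [[0.0], [1.0]], [0.0, 1.0], 2.0)
```

At a training input the two bounds coincide with the observed value, so the predictor interpolates the data. The abstract's contribution, tuning the metric parameters by minimising validation error, would wrap a routine like this in an outer optimisation loop, which is not shown here.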

    Conservative collision prediction and avoidance for stochastic trajectories in continuous time and space

    Existing work in multi-agent collision prediction and avoidance typically assumes discrete-time trajectories with Gaussian uncertainty or that are completely deterministic. We propose an approach that allows detection of collisions even between continuous, stochastic trajectories with the only restriction that means and variances can be computed. To this end, we employ probabilistic bounds to derive criterion functions whose negative sign provably is indicative of probable collisions. For criterion functions that are Lipschitz, an algorithm is provided to rapidly find negative values or prove their absence. We propose an iterative policy-search approach that avoids prior discretisations and yields collision-free trajectories with adjustably high certainty. We test our method with both fixed-priority and auction-based protocols for coordinating the iterative planning process. Results are provided in collision-avoidance simulations of feedback controlled plants. Comment: This preprint is an extended version of a conference paper that is to appear in Proceedings of the 13th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2014).
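The idea of finding negative values of a Lipschitz criterion function, or proving that none exist, can be illustrated with a generic interval-bisection sketch. This is not the algorithm from the paper: it assumes a scalar function on an interval with a known Lipschitz constant, and the name `find_negative` and the `min_width` cut-off are hypothetical. On an interval [lo, hi] the Lipschitz property gives the lower bound (f(lo) + f(hi))/2 - L(hi - lo)/2, so an interval whose bound is non-negative can be discarded.

```python
from collections import deque

def find_negative(f, a, b, lipschitz_const, min_width=1e-6):
    """Search [a, b] for a point where f is negative.

    Returns ("negative", t) if f(t) < 0 was found, ("none", None) if the
    Lipschitz bound proves f >= 0 everywhere, or ("undecided", None) if some
    interval shrank below min_width without being resolved.
    """
    queue = deque([(a, f(a), b, f(b))])
    undecided = False
    while queue:
        lo, f_lo, hi, f_hi = queue.popleft()
        if f_lo < 0:
            return "negative", lo
        if f_hi < 0:
            return "negative", hi
        # Lowest value f can take on [lo, hi] given its Lipschitz constant.
        lower_bound = 0.5 * (f_lo + f_hi) - 0.5 * lipschitz_const * (hi - lo)
        if lower_bound >= 0:
            continue  # proven non-negative on this interval
        if hi - lo < min_width:
            undecided = True  # give up refining this interval
            continue
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        queue.append((lo, f_lo, mid, f_mid))
        queue.append((mid, f_mid, hi, f_hi))
    return ("undecided", None) if undecided else ("none", None)

# Usage: both functions have |f'| <= 1 on [0, 1].
status_a, t_a = find_negative(lambda t: (t - 0.5) ** 2 - 0.01, 0.0, 1.0, 1.0)
status_b, t_b = find_negative(lambda t: (t - 0.5) ** 2 + 0.1, 0.0, 1.0, 1.0)
```

The breadth-first queue refines all unresolved intervals evenly; a priority queue ordered by the lower bound would be a natural alternative when a negative value is expected.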

    Fast Agent-Based Simulation Framework of Limit Order Books with Applications to Pro-Rata Markets and the Study of Latency Effects

    We introduce a new software toolbox, called Multi-Agent eXchange Environment (MAXE), for agent-based simulation of limit order books. Offering both efficient C++ implementations and Python APIs, it allows the user to simulate large-scale agent-based market models while providing user-friendliness for rapid prototyping. Furthermore, it benefits from a versatile message-driven architecture that offers the flexibility to simulate a range of different (easily customisable) market rules and to study the effect of auxiliary factors, such as delays, on the market dynamics. Showcasing its utility for research, we employ our simulator to investigate the influence that the choice of matching algorithm has on the behaviour of artificial trader agents in a zero-intelligence model. In addition, we investigate the role of the order processing delay in normal trading on an exchange and in the scenario of a significant price change. Our results include the findings that (i) the variance of the bid-ask spread exhibits a behaviour similar to the resonance of a damped harmonic oscillator with respect to the processing delay and that (ii) the delay markedly affects the impact a large trade has on the limit order book.

    Asynchronous Deep Double Duelling Q-Learning for Trading-Signal Execution in Limit Order Book Markets

    We employ deep reinforcement learning (RL) to train an agent to successfully translate a high-frequency trading signal into a trading strategy that places individual limit orders. Based on the ABIDES limit order book simulator, we build a reinforcement learning OpenAI gym environment and utilise it to simulate a realistic trading environment for NASDAQ equities based on historic order book messages. To train a trading agent that learns to maximise its trading return in this environment, we use Deep Duelling Double Q-learning with the APEX (asynchronous prioritised experience replay) architecture. The agent observes the current limit order book state, its recent history, and a short-term directional forecast. To investigate the performance of RL for adaptive trading independently from a concrete forecasting algorithm, we study the performance of our approach utilising synthetic alpha signals obtained by perturbing forward-looking returns with varying levels of noise. Here, we find that the RL agent learns an effective trading strategy for inventory management and order placing that outperforms a heuristic benchmark trading strategy having access to the same signal. Comment: 10 pages
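For readers unfamiliar with the two ingredients named in "Duelling Double Q-learning", the sketch below shows the standard textbook forms of each in NumPy. It is illustrative only and says nothing about the paper's network, environment or APEX setup; the function names and the `discount` parameter are hypothetical. The duelling head combines a state value with mean-centred action advantages, and the double Q-learning target selects the next action with the online network but evaluates it with the target network.

```python
import numpy as np

def dueling_q_values(state_value, advantages):
    """Combine V(s), shape (batch,), and A(s, a), shape (batch, n_actions), into Q(s, a)."""
    return state_value[:, None] + advantages - advantages.mean(axis=1, keepdims=True)

def double_q_targets(rewards, dones, q_next_online, q_next_target, discount=0.99):
    """Double Q-learning targets for a batch of transitions.

    q_next_online and q_next_target hold Q(s', .) from the online and target
    networks, each of shape (batch, n_actions); dones is 1.0 for terminal steps.
    """
    # The online network picks the action, the target network scores it.
    best_actions = np.argmax(q_next_online, axis=1)
    next_values = q_next_target[np.arange(len(rewards)), best_actions]
    return rewards + discount * (1.0 - dones) * next_values

# Usage on a toy batch of two transitions with two actions.
q = dueling_q_values(np.array([1.0]), np.array([[1.0, 3.0]]))
targets = double_q_targets(
    rewards=np.array([1.0, 0.0]),
    dones=np.array([0.0, 1.0]),
    q_next_online=np.array([[0.0, 2.0], [5.0, 1.0]]),
    q_next_target=np.array([[10.0, 20.0], [30.0, 40.0]]),
    discount=0.5,
)
```

Decoupling action selection from evaluation is what reduces the over-estimation bias of plain Q-learning; the terminal mask ensures no bootstrapped value is added after an episode ends.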

    Data-based predictive control

    This article presents the stabilising design of a predictive controller based solely on input-output data from the system to be controlled. The model used by this controller is a nonlinear function estimated using a nonparametric machine learning technique known as Kinky inference. As first tests in this new line of research, controllers are designed and tested on a continuously stirred reactor, taking into account the issues required for a correct practical implementation. Funding: MINECO (España) DPI2013-48243-C2-2-R; FEDER/UE DPI2016-76493-C3-1-

    Online learning constrained model predictive control based on double prediction

    A data-based predictive controller is proposed, offering both robust stability guarantees and online learning capabilities. To merge these two properties in a single controller, a double-prediction approach is taken. On the one hand, a safe prediction is computed using Lipschitz interpolation on the basis of an offline identification dataset, which guarantees safety of the controlled system. On the other hand, the controller also benefits from the use of a second online learning-based prediction as measurements incrementally become available over time. Sufficient conditions for robust stability and constraint satisfaction are given. Illustrations of the approach are provided in a simulated case study. Funding: Feder (UE) DPI2016‐76493‐C3‐1‐R; Universidad de Sevilla VI‐PPIT; Ministerio de Economía y Competitividad (MINECO), España DPI2016‐76493‐C3‐1‐

    Asynchronous Deep Double Dueling Q-learning for trading-signal execution in limit order book markets

    We employ deep reinforcement learning (RL) to train an agent to successfully translate a high-frequency trading signal into a trading strategy that places individual limit orders. Based on the ABIDES limit order book simulator, we build a reinforcement learning OpenAI gym environment and utilize it to simulate a realistic trading environment for NASDAQ equities based on historic order book messages. To train a trading agent that learns to maximize its trading return in this environment, we use Deep Dueling Double Q-learning with the APEX (asynchronous prioritized experience replay) architecture. The agent observes the current limit order book state, its recent history, and a short-term directional forecast. To investigate the performance of RL for adaptive trading independently from a concrete forecasting algorithm, we study the performance of our approach utilizing synthetic alpha signals obtained by perturbing forward-looking returns with varying levels of noise. Here, we find that the RL agent learns an effective trading strategy for inventory management and order placing that outperforms a heuristic benchmark trading strategy having access to the same signal.

    Conservative decision-making and inference in uncertain dynamical systems

    The demand for automated decision making, learning and inference in uncertain, risk sensitive and dynamically changing situations presents a challenge: to design computational approaches that promise to be widely deployable and flexible to adapt on the one hand, while offering reliable guarantees on safety on the other. The tension between these desiderata has created a gap that, in spite of intensive research and contributions made from a wide range of communities, remains to be filled. This represents an intriguing challenge that provided motivation for much of the work presented in this thesis. With these desiderata in mind, this thesis makes a number of contributions towards the development of algorithms for automated decision-making and inference under uncertainty. To facilitate inference over unobserved effects of actions, we develop machine learning approaches that are suitable for the construction of models over dynamical laws that provide uncertainty bounds around their predictions. As an example application for conservative decision-making, we apply our learning and inference methods to control in uncertain dynamical systems. Owing to the uncertainty bounds, we can derive performance guarantees of the resulting learning-based controllers. Furthermore, our simulations demonstrate that the resulting decision-making algorithms are effective in learning and controlling under uncertain dynamics and can outperform alternative methods. Another set of contributions is made in multi-agent decision-making which we cast in the general framework of optimisation with interaction constraints. The constraints necessitate coordination, for which we develop several methods. As a particularly challenging application domain, our exposition focusses on collision avoidance. Here we consider coordination both in discrete-time and continuous-time dynamical systems. 
    In the continuous-time case, inference is required to ensure that decisions are made that avoid collisions with adjustably high certainty even when computation is inevitably finite. In both discrete-time and finite-time settings, we introduce conservative decision-making. That is, even with finite computation, a coordination outcome is guaranteed to satisfy collision-avoidance constraints with adjustably high confidence relative to the current uncertain model. Our methods are illustrated in simulations in the context of collision avoidance in graphs, multi-commodity flow problems, distributed stochastic model-predictive control, as well as in collision-prediction and avoidance in stochastic differential systems. Finally, we provide an example of how to combine some of our different methods into a multi-agent predictive controller that coordinates learning agents with uncertain beliefs over their dynamics. Utilising the guarantees established for our learning algorithms, the resulting mechanism can provide collision avoidance guarantees relative to the a posteriori epistemic beliefs over the agents' dynamics.